(54) Translation method and system

(57) The present invention provides a machine
translation method and system that together improve
the accuracy of the selection of appropriate words,
without incurring any deterioration of the processing
efficiency.

During the translation of a document by using a
compound word dictionary, elemental word information
of an applied compound word is registered in a
discourse dictionary, and to translate the document without
using a compound word dictionary, a plurality of
dictionaries, including the discourse dictionary, are employed.
Description

Field of the Invention

[0001] The present invention relates to a translation
system for which high speed processing is required, and
in particular to a translation method and system for
improving the accuracy of the selection of an appropriate
word in machine translation, without incurring any
deterioration of the processing efficiency.

Background Art 

[0002] As a consequence of the WWW expansion of 
the Internet, opportunities for using documents ex- 
pressed in foreign languages have increased. And since 
many users desire to scan documents in their native lan- 
guages, there is a growing demand for low priced ma- 
chine translation software. However, the quality of the 
text provided by current machine translation software is 
not satisfactory, and there are many translation errors. 
[0003] Since for a connection on the Internet a trans- 
lation system must initiate a translation process in real 
time, high speed processing is required and the per- 
formance of complicated procedures, such as deep se- 
mantic analysis, is difficult. Generally, therefore, such a 
system is equipped with a dictionary to reduce the 
number of unknown words, and for document scanning, 
more or less ambiguous translations are prepared that 
are at least prevented from straying too far from the point.
To avoid complicating the process and to increase the 
accuracy of a translation, the data structure of such a 
dictionary tends to be relatively simple, and word trans- 
lations tend to be registered not only as individual word 
units (a single word dictionary) but as compound word 
units (a compound word dictionary). During a transla- 
tion, since the simple data structure has a poor word 
selection function, when there are words for which the 
translation is registered by the units of compound words, 
the selection of the translation registered for the com- 
pound word unit frequently results in a better translation. 
[0004] Further, in general isolated translation of indi- 
vidual sentences is performed. As a result, for a specific 
word that is repeatedly used in a plurality of locations in 
the same text, there may also be given a plurality of 
translations; for one location a translation may be se- 
lected from an entry in a single word dictionary, and for 
another location a translation may be selected from an 
entry in a compound word dictionary. 
[0005] To resolve this problem, according to a ma- 
chine translation method disclosed in Japanese Unex- 
amined Patent Publication No. Hei 3-135666, in a trans- 
lation process information concerning a translation that 
is obtained as the result of a dictionary search is saved 
in a main memory, and is re-used for the same word, so 
that the time spent searching a dictionary located in an
auxiliary storage device is saved and so that the translation
of the word is consistent. With this method, however,
when an incorrect translation is first selected for a
word, the incorrect translation is used in all the locations 
in a document at which that word appears. 
[0006] In a method for processing a plurality of
sentences, which is disclosed in Japanese Unexamined
Patent Publication No. Hei 2-228765, the inherent
ambiguity of each sentence in a document consisting of
a plurality of sentences is calculated, and translation is
initiated with the least ambiguous sentence. The results
obtained for a polysemous word in a preceding sentence
are used for a succeeding sentence in order to increase
the accuracy in the selection of an appropriate translation
and in order to provide a consistent translation. This
method, however, is premised on the assumption that a
translation will be output after all the sentences in the
document have been processed, and thus it cannot be
employed for a process by which sentences are
successively translated from the beginning of a document,
as when a translation process is initiated in real time
while a system is connected to the Internet.
[0007] It is an object of the present invention to pro- 
vide a technique which alleviates the above drawbacks 
of the prior art. 

[0008] According to the present invention we provide 
a translation system for performing translation using a 
plurality of dictionaries, comprising: (a) means for reg- 
istering in a discourse dictionary, during the translation 
of a document by using a compound word dictionary, 
elemental word information of an applied compound 
word; and (b) means for employing a plurality of diction- 
aries, including said discourse dictionary, in order to 
translate a word in the document that is not defined in 
the compound word dictionary. 
[0009] Further according to the present invention we 
provide a translation method for performing translation 
using a plurality of dictionaries, comprising the steps of: 
(a) during the translation of a document by using a com- 
pound word dictionary, registering in a discourse diction- 
ary, elemental word information of an applied compound 
word; and (b) employing a plurality of dictionaries, including
the discourse dictionary, in order to translate a
word in the document that is not defined in the com- 
pound word dictionary. 

[0010] Also according to the present invention we pro- 
vide a storing medium for storing a program for perform- 
ing translation using a plurality of dictionaries, the pro- 
gram comprising: (a) a function for, during the transla- 
tion of a document by using a compound word diction- 
ary, registering in a discourse dictionary, elemental word 
information of an applied compound word; and (b) a 
function for employing a plurality of dictionaries, includ- 
ing the discourse dictionary, in order to translate a word 
in the document that is not defined in the compound 
word dictionary. 

[0011] It is, therefore, one object of the present inven- 
tion to provide a machine translation method and sys- 
tem that together improve the accuracy of the selection 
of appropriate words, without incurring any deterioration 
of the processing efficiency.

[0012] According to an embodiment of the present
invention we provide a translation method and system
that, only when a sentence for translation is selected by
a user and without requiring the employment of a
complicated process, automatically examines the definitions
of words to select preferred words and can thus improve
the accuracy of a translation.

[0013] According to an embodiment of the present
invention we provide a translation method and system that
can translate words in consonance with context, without
requiring a complicated process, such as a grammatical
description process.

[0014] According to an embodiment of the present
invention we provide a system that, for candidate words,
accumulates preference information, which is obtained
as a discourse dictionary during the translation of a
document, and employs that information as a personal
dictionary to automatically learn the preferences of
candidate words.

Brief Description of the Drawings

[0015] Fig. 1 is a flowchart showing the translation 
process according to the present invention. 
[0016] Fig. 2 is a flowchart showing the discourse dic- 
tionary registration process according to the present in- 
vention. 

[0017] Fig. 3 is a flowchart showing the single-word
determination process according to the present invention.

[0018] Fig. 4 is a flowchart showing the re-translation
process according to the present invention.
[0019] Fig. 5 is a diagram illustrating an example 
hardware arrangement of a translation system accord- 
ing to the present invention. 

[0020] The present invention will now be described by
employing examples.

* When a compound word dictionary is employed to 
translate a document, a discourse dictionary prep- 
aration method is employed whereby elemental 
word information of a compound word that is em- 
ployed is registered in a discourse dictionary. 

[0021] Assume that the entry "'civil trial' 'minji saiban'"
is present in a compound word dictionary. When this
entry is applied to the translation of a sentence
"... a book about the civil trial with ..." in a document,
candidate word information for "civil" and "trial" is
registered in the discourse dictionary so that this
information is reflected in the translation of the document.

A discourse dictionary description method is employed
for describing elemental words, their candidate
words, and preferences for these candidate
words as the elemental word information for a
compound word to be described in the discourse
dictionary.

[0022] In the above example, the candidate words for
"civil" and "trial" and their preferences (e.g., "'trial'
'saiban' 1.0") are described in the discourse dictionary.

Employed is a candidate word selection method
whereby, to determine a translation for elemental
words of a compound word to be described in a
discourse dictionary, a candidate translation obtained
from a single word dictionary for the elemental word
is compared with a candidate translation for the
compound word, and the candidate word that has
the most nearly identical character string portion is
selected. Further employed is a registration adequacy
determination method whereby registration
of a compound word in a discourse dictionary is
cancelled when the ratio of the identical character
string portion in the candidate word does not exceed
a threshold value.

[0023] In the above example, in a single-word dictionary,
the following candidate words are entered for "trial":

trial 0200 N kohanN + jN = jNOCONJG
trial 0201 N saibanN + jN = jNOCONJG
trial 0202 N tameshiN + jN = jNOCONJG
trial 0203 N shikenN + jN = jNOCONJG
trial 0204 N shikoN + jN = jNOCONJG
trial 0205 N shiyoN + jN = jNOCONJG
trial 0206 N shirenN + jN = jNOCONJG
trial 0207 N koteshirabeN + jN = jNOCONJG
trial 0208 N shinkuN + jN = jNOCONJG
trial 0209 N shinriN + jN = jNOCONJG
trial 0210 N +eABST kokoromiN + jN = jNOCONJG
trial 0211 ADJ shikentekiAP + jAN = jTYPENA

[0024] The translation "minji saiban," for the compound
word "civil trial," is compared with the character
strings of the individual candidate words, and the
candidate word "saiban," which has the largest identical
portion, is selected as a candidate word to be registered
in the discourse dictionary. Candidate words for "civil"
are as follows:

civil 0200 ADJ shiminN + jN + jNOCONJG
civil 0201 ADJ minkanN + jN + jNOCONJG
civil 0202 ADJ reigi tadashiiADJ + jADJ + jKEI
civil 0203 ADJ joyoN + jN + jNOCONJG

[0025] There are two candidate words, "shimin" and
"minkan," which have character string portions identical
to the translation "minji saiban" of the compound word
"civil trial," and for both candidate words, the ratio of
the identical portion to the character string of the
candidate word is 50%. When a threshold value set in
advance is lower than this value, both candidate words are
registered, and when the threshold value is higher than
this, neither word is registered.

Employed is a selection determination method
whereby a numerical value, which is obtained
by multiplying the word length of a compound
word by the ratio of the identical character string
portions of the candidate word to that of the
compound word, is employed as a preference for the
candidate word for the elemental word, and whereby,
when the same candidate word has been registered
with the same headword, a new preference
value is added to a preference value that has
already been provided.

[0026] In the above example, the ratio (= 1) for the
identical portion (= "saiban") of the candidate word
"saiban" to the translation "minji saiban" is multiplied by
a coefficient according to the word length of the compound
word (the word count in "civil trial"; two in this case). A
greater coefficient is set as the word length is increased.
In this case, the coefficient is the square root of N-1
when the word length is N. The obtained value (1.0 in
this case) is used as the preference for the candidate
word "saiban," for the word "trial" in "civil trial." When
the same headword "trial" is already present in the
discourse dictionary, and its candidate word "saiban" is
also registered, the above obtained preference 1.0 is
added to the preference that is already given for the
registered word.
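
The arithmetic of this example can be written out as a short worked equation. The symbols r (ratio of the identical portion), N (word count of the compound word) and p (preference) are introduced here only for illustration and are not notation used by the original text.

```latex
p = r \times \sqrt{N - 1}, \qquad
r = 1,\; N = 2 \;\Rightarrow\; p = 1 \times \sqrt{2 - 1} = 1.0, \qquad
p_{\text{registered}} \leftarrow p_{\text{registered}} + p
```

The last term restates the accumulation rule: when the same candidate word is already registered under the same headword, the newly obtained value is added to the stored preference.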

* According to an embodiment of the present invention,
employed is a candidate word selection method
whereby, when a specific sentence is to be translated,
a discourse dictionary is referred to for a word
for which a compound word dictionary cannot be
employed, and whereby, if a headword exists, the
registered candidate word to which the highest
preference is given is selected.

* According to an embodiment of the present invention,
employed is an automatic learning personal
dictionary preparation method, whereby a discourse
dictionary consisting of units of translated
sentences (e.g., one WWW page, one article, etc.)
and a plurality of discourse dictionaries, which have
been prepared for various sentences translated by
a specific person, are merged to create an automatic
learning personal dictionary.



Employed is a method for adjusting the learning
function of a personal dictionary, whereby the
preferences in an updated discourse dictionary
are first selected to merge a plurality of discourse
dictionaries.

Employed is a dictionary employment method
whereby, when a compound word dictionary is not
employed to translate a specific document, a
discourse dictionary is referred to first, and then
an automatic learning personal dictionary is
referred to.

* According to an embodiment of the present invention,
employed is a method whereby, to translate a
specific document, candidate words into which
individual words in a document were translated are
recorded, the contents of the translation of the text
are compared with a generated discourse dictionary
when the translation is completed in order to
evaluate how the first translation may be changed
by re-translation, the evaluation results are provided
for a user, and in response to a user's request,
the translation is performed again by using the
generated discourse dictionary.

Employed is a translation results recording
method whereby headwords (words in a source
language) and their candidate words, and the
count of the headwords that were translated into
the candidate words (or the sentence
number of a translated sentence) are described
as the translation results, i.e., a record of which
candidate words were used for translating the
individual words in a document.

Employed is a re-translation result change
evaluation method whereby, after a document
has been translated, a discourse dictionary is
compared with the translation, and a count is
acquired of the words translated into words other
than candidate words that have the highest
preferences in the discourse dictionary, so that
the number of locations at which the candidate
words are changed through re-translation is
acquired.

For re-translation of a document, employed is
a method for increasing the efficiency of a
re-translation process, whereby only those
sentences are processed for which the translation
will be changed.

* According to an embodiment of the present invention,
employed is a candidate word selection mechanism
that employs a co-occurrence dictionary with
which, when n specific words appear together, the
translation of at least one of them can be designated.

Example:

House NOUN kain; Senate NOUN jooin;
bank NOUN dote; river NOUN;

[0027] In the first example, when the nouns "House" and
"Senate" appear in the same text (normally in the same
sentence; but the range may be expanded to an entire
paragraph or an entire document), the translations for
these words are defined respectively as "kain" and
"jooin." In the second example, when the nouns "bank" and
"river" appear in the same text, the translation of "bank" is
defined as "dote."

[0028] Since these conflicts are reflective, when n 
words are included in one dictionary entry, candidate 
words for a maximum of n words are determined. When 
one entry is pertinent, all the words included in the entry 
appear. Thus, without taking the designation of the con- 
flicting candidate words into account, an n-times search 
of the co-occurrence dictionary for each word is not re- 
quired. 
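
A minimal sketch of how such co-occurrence entries might be applied is given below. It assumes that an entry fires only when every word it lists appears in the context, that matching is case-sensitive, that the part of speech is carried but not checked, and that the first applicable entry wins when two entries designate different translations for the same word. The names `CO_OCCURRENCE` and `apply_co_occurrence` are hypothetical and do not come from the original text.

```python
from typing import Dict, List, Optional, Tuple

# One entry lists co-occurring words as (word, part of speech, translation).
# A translation of None means the word only acts as a trigger, as "river"
# does in the second example entry. (Representation assumed for illustration.)
Entry = List[Tuple[str, str, Optional[str]]]

CO_OCCURRENCE: List[Entry] = [
    [("House", "NOUN", "kain"), ("Senate", "NOUN", "jooin")],
    [("bank", "NOUN", "dote"), ("river", "NOUN", None)],
]


def apply_co_occurrence(context_words: List[str],
                        entries: List[Entry]) -> Dict[str, str]:
    """Return the translations designated by every entry whose words all occur."""
    present = set(context_words)
    chosen: Dict[str, str] = {}
    for entry in entries:
        # An entry is pertinent only when all of its words appear in the context.
        if all(word in present for word, _pos, _tr in entry):
            for word, _pos, translation in entry:
                # Assumption: the first applicable entry wins on conflict.
                if translation is not None and word not in chosen:
                    chosen[word] = translation
    return chosen
```

Because a pertinent entry settles the translations of all its words at once, a single pass over the entries is enough in this sketch; the priority-based conflict rule described for the co-occurrence dictionary is not modelled here.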

[0029] The preferred embodiment of the present in- 
vention will now be described while referring to the ac- 
companying drawings. Fig. 5 is a schematic diagram il- 
lustrating a hardware arrangement of a translation sys- 
tem according to the present invention. A system 100 
includes a central processing unit (CPU) 1 and a mem- 
ory 4. The CPU 1 and the memory 4 are connected via 
a bus 2 and an IDE controller 25 to a hard disk drive 13 
(or to a storage medium driver, such as a CD-ROM 26 
or a DVD 32), which serves as an auxiliary storage de- 
vice. Further, the CPU 1 and the memory 4 are connect- 
ed via the bus 2 and a SCSI controller 27 to a hard disk 
drive 30 (or to a storage medium driver, such as an MO 
28, a CD-ROM 29 or a DVD 31), which also serves as
an auxiliary storage device for storing a dictionary, etc. 
A floppy disk drive 20 is connected via a floppy disk con- 
troller 19 to the bus 2. 

[0030] A floppy disk is inserted into the floppy disk 
drive 20. Computer program code or data for cooperat- 
ing with an operating system to instruct the CPU to ex- 
ecute the present invention is stored on the floppy disk 
and the hard disk drive 13 (a storage medium, such as
an MO, CD-ROM or a DVD), and in a ROM 14. For execution,
the program code and the data are loaded into
the memory. The computer program code may be com- 
pressed, or may be divided into a plurality of segments 
for storage in a plurality of storage media. 
[0031] The system 100 also includes user interface 
hardware, has a pointing device (a mouse or a joystick) 
7 or a keyboard 6 for data input, and employs a display 
12 for visually providing data for a user. A printer can be
connected to the system through a parallel port 16, and
a modem can be connected through a serial port 15. For
communication with another computer, the system 100
can be connected to a network, through the serial port
15, by the modem or a communication adaptor 18 (an
Ethernet or a token ring card). A remote controlled
transceiver for the exchange of data using infrared rays or
wire can be connected to the serial port 15 or the parallel
port 16. 

[0032] From an amplifier 22, a loudspeaker 23 re- 
ceives an audio signal obtained by a D/A (digital/analog) 
conversion performed by an audio controller 21, and
outputs it as sound. The audio controller 21 converts
analog audio data, received from a microphone 24, into
digital data, and also fetches external audio data into
the system and translates the data in cooperation with 
sound recognition software. 

[0033] From the above description, it should be easily
understood that the translation system of the present
invention can be implemented by employing a common
personal computer (PC); a workstation; a notebook PC;
a palmtop PC; a network computer; various electronic
home appliances, such as a television incorporating a
computer; a game machine having a communication
function; a communication terminal having a communication
function, such as a telephone, a facsimile machine,
a portable telephone, a PHS or a personal digital
assistant; or a combination of these devices. The above
described components are merely examples, and not all
of them are required for the present invention.
[0034] A plurality of dictionaries and various types of
buffers may be located in the memory 4, but usually, the
memory 4 is used as a storage buffer for storing a
discourse dictionary, translation results and a
co-occurrence dictionary, and the hard disk 30 is used as a
secondary storage device for storing a personal dictionary,
etc. Compound words may be included in the co-occurrence
dictionary, and when the origin (e.g., "singular
form") is defined for each word, it may be described,
together with a part of speech, as a limitation. Generally,
the co-occurrence dictionary is constituted by entries
that each include two or more phrases, and their
limitations and translations. A word is assumed to match
regardless of upper or lower case, and to match either the
stem of the word or an inflected form of the word. When
the part of speech is omitted, the word may match a word
having any part of speech.

* The structure of the co-occurrence dictionary is as
follows. Elements surrounded by square brackets,
[ ], are optional.

[Priority:] co-occurrence word 1 [part of speech] [translation]; co-occurrence word 2 [part of speech] [translation]; ...

* The structure of the discourse dictionary (discourse
dictionary buffer) is as follows.

headword (word in a source language):

candidate 1: preference
candidate 2: preference
...
candidate n: preference

* The structure of the translation result recording buffer
is as follows.

headword (word in a source language):

candidate 1: sentence translation No. 1 to sentence translation No. m1 (m1)
candidate 2: sentence translation No. 1 to sentence translation No. m2 (m2)
...
candidate n: sentence translation No. 1 to sentence translation No. mn (mn)

(Note: m1, m2, ... and mn are the frequencies of
the individual candidate words.)

* The structure of a personal dictionary is the same
as that of the discourse dictionary.

[0035] Fig. 1 is a flowchart for the processing performed
according to the present invention. First, at step
110, beginning with the first sentence of an input document,
the translation system extracts and processes
sentences, one by one. At step 120, a word string that
constitutes a sentence to be processed is examined to
find relevant compound words. When a word string is
found corresponding to a compound word, the information
for the relevant compound word (the character
string of the compound word in the source language and
the character string for the translation into a target
language) is used for a discourse dictionary registration
process at step 150.

[0036] At step 130, the single-word process is performed
for the words that are not relevant to the compound
word at step 120. When the translation of a single
word is decided by conducting a search of the discourse
dictionary or the co-occurrence dictionary, that translation
is used. When the translation of the single word is
not decided, that word is transmitted to a personal
dictionary where a search for it is performed. When a
translation is returned, it is employed, and when no
translation is returned, the word is output to the
single-word dictionary where another search is performed to
obtain the translation. The object word and its translation
are output for the translation result recording process.
At step 140, a check is performed to determine
whether the object word is present as a headword in the 
translation result recording buffer. When the object word 
is not present, a headword for the object word is created, 
and the candidate word, the sentence number and an 
appearance frequency of 1 are stored in the translation 
result recording buffer. When the object word is present 
in the buffer, a check is performed to determine whether 
a candidate word for the object word is present. If a can- 
didate word is present, a new sentence number is added 
to the old one, and the appearance frequency is incre- 
mented by one. When no candidate word is present, a 
candidate word, the sentence number and an appear- 
ance frequency of 1 are stored in the translation result 
recording buffer. 
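
The bookkeeping at step 140 can be sketched as a nested mapping. This is only an illustration under stated assumptions: the buffer is modelled as headword -> candidate word -> list of sentence numbers, the appearance frequency is taken to be the length of that list, and the function names `new_buffer`, `record` and `frequency` are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List

# headword -> candidate word -> sentence numbers in which it was used
RecordBuffer = DefaultDict[str, Dict[str, List[int]]]


def new_buffer() -> RecordBuffer:
    """Create an empty translation result recording buffer."""
    return defaultdict(dict)


def record(buffer: RecordBuffer, headword: str,
           candidate: str, sentence_no: int) -> None:
    """Create the headword/candidate on first use, else append the sentence number."""
    buffer[headword].setdefault(candidate, []).append(sentence_no)


def frequency(buffer: RecordBuffer, headword: str, candidate: str) -> int:
    """Appearance frequency, assumed equal to the number of recorded sentences."""
    # .get() avoids creating an empty headword entry as a side effect.
    return len(buffer.get(headword, {}).get(candidate, []))
```

Storing sentence numbers rather than a bare counter keeps enough information to re-translate only the affected sentences later; a word used twice in one sentence would be counted twice in this sketch.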

[0037] When, at step 160, all the words that are not 
relevant to the compound word have been processed, 
program control moves to step 170 whereat the trans- 
lation process is activated upon the receipt of candidate 
words for all the words. When all the words have not yet
been processed, program control returns to step 120. 
[0038] At step 170, the translation process is performed,
which is a conventional machine translation
process. At step 170, morpheme analysis, grammar
analysis, or another desired translation method may be
performed. The difference in the contents of the machine
translation process does not affect the subject of
the present invention. At step 180, a check is performed
to determine whether all the sentences in a document
have been translated. When all the sentences have
been translated, a re-translation effect evaluation process
is performed. Finally, at step 195, a personal dictionary
registration process is performed. During this
process, the contents of the discourse dictionary buffer
are written over the contents of the personal dictionary.
At this time, preference values are added so that a
priority can be given to the preference held in the discourse
dictionary buffer. For example, for each headword in the
personal dictionary, the preferences for the individual
candidate words are normalized so that the total is
constant. When the same candidate word is present for the
same headword in the discourse dictionary buffer, its
preference is added and the preferences in the personal
dictionary are normalized again. The re-translation process at
step 190 and the personal dictionary registration process
at step 195 are not always needed. However, by
using these processes the accuracy of the translation
can be increased.
[0039] Fig. 2 is a detailed flowchart for the discourse
dictionary registration process (step 150).

[0040] At step 210, of the received compound word
information, a character string for a compound word in
a source language is separated into words, which are
then transmitted to a single-word dictionary where a
search is performed to obtain candidate words for the
individual words. At step 220, the character string for
each candidate word is compared with a character string
in a target language, which is the translation of the
compound word, to obtain the number of identical characters
in both the candidate word character string and the
target language character string. That character count is
divided by the number of characters in the candidate
word, and the obtained value is used as a matching
value for the compound word translation. At step
230, the candidate word having the highest matching
value for the compound word translation is selected.
At step 240 a check is performed to determine
whether the matching value exceeds a threshold
value set in advance. If it does, at step 250 the
information for the candidate word is registered in the
discourse dictionary buffer. A plurality of candidate words
may be registered, and when there is more than one
candidate word, the following process is repeated.

When an object to be registered remains, the preference
to be registered in the discourse dictionary
is calculated. The matching value for the
compound word translation is multiplied by the
coefficient according to the word length of the compound
word, and the resultant value is used as the
preference. For example, (preference) = (matching
value with compound word translation) x (square
root of (word count of compound word - 1)).

When an object to be registered remains, a check 
is performed to determine whether the word in the 
source language is included with the headword in 
the discourse dictionary. If not, the candidate word 
and the obtained preference are registered. If the 
word in the source language is included with the 
headword, a check is performed to determine 
whether or not the candidate word to be registered 
has already been listed among the candidate words 
that are registered. If the word has been listed, the 
obtained preference is added to the preference that 
is already registered. If the word has not been listed, 
the candidate word and the obtained preference are 
registered. 

[0041] At step 260 a check is performed to determine 
whether there is another candidate word that has a most 
nearly matching value. If such a candidate word re- 
mains, the processes following step 250 are repeated. 
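
The registration flow of Fig. 2 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the "identical portion" is approximated by the longest common substring (found with Python's `difflib.SequenceMatcher`), comparison is done on the romanized strings used in the examples above, tied best candidates are all registered, and the names `matching_value` and `register_compound` as well as the default threshold are hypothetical.

```python
import math
from difflib import SequenceMatcher
from typing import Dict, List

DiscourseDict = Dict[str, Dict[str, float]]  # headword -> candidate -> preference


def matching_value(candidate: str, compound_translation: str) -> float:
    """Longest shared substring length divided by the candidate's length."""
    if not candidate:
        return 0.0
    matcher = SequenceMatcher(None, candidate, compound_translation, autojunk=False)
    match = matcher.find_longest_match(0, len(candidate), 0, len(compound_translation))
    return match.size / len(candidate)


def register_compound(discourse: DiscourseDict,
                      source_words: List[str],
                      compound_translation: str,
                      single_word_dict: Dict[str, List[str]],
                      threshold: float = 0.6) -> None:
    """Register elemental word information of an applied compound word."""
    # Coefficient from the text: square root of (word count - 1).
    coefficient = math.sqrt(len(source_words) - 1)
    for word in source_words:                                # step 210
        candidates = single_word_dict.get(word, [])
        if not candidates:
            continue
        scores = {c: matching_value(c, compound_translation)  # step 220
                  for c in candidates}
        best = max(scores.values())                           # step 230
        if best <= threshold:                                 # step 240
            continue
        for cand, score in scores.items():                    # steps 250-260
            if score == best:
                entry = discourse.setdefault(word, {})
                # Add to any preference already registered for this candidate.
                entry[cand] = entry.get(cand, 0.0) + score * coefficient
```

With the example data, "saiban" matches "minji saiban" completely and is registered for "trial" with preference 1.0, while "shimin" and "minkan" each share only "min" (a ratio of 0.5) and are registered for "civil" only when the threshold is set below 0.5.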
[0042] Fig. 3 is a detailed flowchart explaining the 
above described single-word process (step 130). Ac- 
cording to the priority of the discourse dictionary and the 
co-occurrence dictionary, the following two processes are
initiated. 

[0043] When priority is given to the designation of the 
candidate word using the co-occurrence dictionary rath- 
er than the discourse dictionary, the decision at step 310 
is YES, and program control moves to step 340. At step 
340, when words that appear in the sentence (or in a 
specific context) are defined as w1; w2; ...; wn, the
co-occurrence dictionary is examined for a word, beginning
at w1, for which the translation has not yet been
established. In the co-occurrence dictionary, words for which
all of the translations have been designated are used as 
headwords, and a set of entries by which the relation- 
ships of the words are described are stored in corre- 
spondence with the headwords. When a specific entry 
in the co-occurrence dictionary is applied, at step 395, 
the designated translations of all the words included in 
the entry are stored. A co-occurrence dictionary entry is 
applied only once for each word, even if it has a plurality 
of matching relationships. As a result, the designations 
for the translations available for each word can be per- 
formed by n accesses of the co-occurrence dictionary. 
When designated candidate words for one word conflict, 
the standard must be applied that priority is given to the 
candidate word for which a higher priority is designated 
in the co-occurrence dictionary, or to a candidate word 
in an entry that has the highest number of co-occurrence
words. When there is no relevant entry recorded in the 
co-occurrence dictionary, at step 350 the discourse dic- 
tionary search process is performed. 
[0044] When priority is given to designating the candidate
word with the discourse dictionary rather than the
co-occurrence dictionary, the decision at step 310 is NO and
program control moves to step 320. At step 320 a check is
performed to determine whether a given word is present as a
headword in the discourse dictionary. If such a headword
exists, at step 390 the candidate word having the highest
priority is selected (if there are a plurality of candidate
words, only one word is selected). If there is no relevant
entry in the discourse dictionary, program control goes to
step 330, whereat the co-occurrence dictionary search process
is performed.
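
The branch at step 310 and the two fallback paths can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the embodiment: the discourse dictionary is modeled as a mapping from headword to candidate words with preference values, and `cooc_lookup` is a hypothetical callable standing in for the co-occurrence dictionary search.

```python
def from_discourse(word, discourse_dict):
    """Steps 320/390: pick the single highest-priority candidate, if any."""
    candidates = discourse_dict.get(word)
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


def select_candidate(word, context, discourse_dict, cooc_lookup, cooc_first):
    """Step 310: consult the preferred dictionary, then fall back to the other.

    cooc_lookup(word, context) is assumed to return a translation or None.
    """
    def discourse():
        return from_discourse(word, discourse_dict)

    def cooc():
        return cooc_lookup(word, context)

    order = (cooc, discourse) if cooc_first else (discourse, cooc)
    for lookup in order:
        result = lookup()
        if result is not None:
            return result
    return None  # neither dictionary has a relevant entry


# Illustrative data only.
discourse_dict = {"bank": {"ginkou": 2.0, "dote": 0.5}}


def cooc_lookup(word, context):
    return "dote" if word == "bank" and "river" in context else None


first = select_candidate("bank", {"river", "bank"}, discourse_dict,
                         cooc_lookup, cooc_first=True)
second = select_candidate("bank", {"river", "bank"}, discourse_dict,
                          cooc_lookup, cooc_first=False)
```

With the same sentence, the two priority settings can yield different candidate words, which is why the order of consultation is made configurable.
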

[0045] Fig. 4 is a detailed flowchart for the re-translation
process (step 190). At step 410 an evaluation value for the
effect of the re-translation is calculated. Each headword in
the discourse dictionary is processed, and a check is
performed to determine whether the headword is registered in
the translation result recording buffer. If it has been
entered in the buffer, a check is performed to determine
whether candidate words for the headword, other than the
candidate word that has the highest priority, are present in
the translation result recording buffer. If there are such
candidate words in the buffer, the total of their appearance
frequencies and all the sentence numbers are stored, and
program control thereafter moves to the next headword. When
all the headwords available in the discourse dictionary have
been processed, the level of the re-translation effect is
evaluated by using the ratio of the stored appearance
frequency to the number of sentences in the document. When,
at step 420, the level of the re-translation effect exceeds a
threshold value that is set in advance, program control
advances to step 430. If the level of the re-translation
effect is below the threshold value, the processing is
terminated. At step 420, the level of the re-translation
effect may be presented to a user, so that the user can
request the re-translation. The process at step 430 and the
following steps is the re-translation process for all the
sentence numbers that have been stored. This process is the
same as those at steps 120, 130, 160, 170 and 180, except
that the discourse dictionary registration process (step 150)
and the translation result registration process (step 140)
are not performed.
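
The evaluation at steps 410 and 420 can be sketched as below. The buffer layout (headword mapped to the candidate words actually used, each with the sentence numbers where it appeared), the function name, and the threshold value are assumptions for illustration; the specification states only that a ratio of appearance frequency to the number of sentences is compared with a preset threshold.

```python
def retranslation_effect(discourse_dict, result_buffer, num_sentences):
    """Return (effect level, sorted sentence numbers to re-translate).

    discourse_dict: headword -> {candidate word: preference value}
    result_buffer:  headword -> {candidate word: [sentence numbers]}
    """
    total_frequency = 0
    sentences = set()
    for headword, candidates in discourse_dict.items():
        used = result_buffer.get(headword)
        if not used or not candidates:
            continue  # headword not registered in the buffer
        best = max(candidates, key=candidates.get)
        for candidate, sentence_numbers in used.items():
            if candidate != best:
                # A lower-priority candidate word was used in these sentences.
                total_frequency += len(sentence_numbers)
                sentences.update(sentence_numbers)
    effect = total_frequency / num_sentences if num_sentences else 0.0
    return effect, sorted(sentences)


# Illustrative data only.
discourse_dict = {"bank": {"ginkou": 3.0, "dote": 1.0}}
result_buffer = {
    "bank": {"ginkou": [1, 4], "dote": [2, 7]},
    "tree": {"ki": [3]},
}
THRESHOLD = 0.1  # hypothetical preset threshold

effect, to_retranslate = retranslation_effect(
    discourse_dict, result_buffer, num_sentences=10)
should_retranslate = effect > THRESHOLD
```

Only the stored sentence numbers need to be translated again, so the cost of the second pass grows with the number of affected sentences rather than with the length of the document.
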

[0046] Conventionally, for the translation of a single word,
unless a special translation mechanism is employed, either
the first candidate word in the system dictionary is
employed, or a candidate word selected by a user during the
post-editing process is employed. When the co-occurrence
dictionary of the present invention is employed, the
translation of words according to the context can be provided
without a complicated process, such as the grammatical
description process, being performed. This is very effective
not only for a person in charge of the development of a
translation system, but also for a common user, because it
serves as a resource that supplements the user's dictionary
and designates more appropriate candidate words.
[0047] Further, in the translation system of the present
invention, during the translation of a document priority
information for candidate words is stored as the discourse
dictionary, which is later used as a personal dictionary, so
that the priority for a candidate word can be automatically
learned.
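
One way to picture the automatic learning is a merge of the discourse dictionary into a personal dictionary after translation. The sketch below assumes both dictionaries map a headword to candidate words with preference values and that preference values for the same headword and candidate word are summed; the merge rule and the function name are assumptions for illustration, not details given in this paragraph.

```python
def merge_into_personal(personal_dict, discourse_dict):
    """Return a new personal dictionary with discourse preferences added."""
    merged = {hw: dict(cands) for hw, cands in personal_dict.items()}
    for headword, candidates in discourse_dict.items():
        slot = merged.setdefault(headword, {})
        for candidate, preference in candidates.items():
            # Assumed rule: accumulate preference for a repeated pair.
            slot[candidate] = slot.get(candidate, 0.0) + preference
    return merged


# Illustrative data only.
personal = {"bank": {"ginkou": 1.0}}
discourse = {"bank": {"ginkou": 2.0, "dote": 0.5},
             "account": {"kouza": 1.0}}
learned = merge_into_personal(personal, discourse)
```
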

Claims 

1. A translation system for performing translation us- 
ing a plurality of dictionaries, comprising: 

(a) means for registering in a discourse diction- 
ary, during the translation of a document by us- 
ing a compound word dictionary, elemental 
word information of an applied compound word; 
and 

(b) means for employing a plurality of diction- 
aries, including said discourse dictionary, in or- 
der to translate a word in the document that is 
not defined in the compound word dictionary. 

2. The translation system according to claim 1, wherein
the plurality of dictionaries are single word dictionaries,
co-occurrence dictionaries, discourse
dictionaries or personal dictionaries, or a combina- 
tion of those dictionaries. 

3. The translation system according to claim 1, wherein
the means for registering includes:

means for determining, when a candidate word 
to be registered in said discourse dictionary is 
to be selected as the elemental word informa- 
tion, a translation for elemental words of a com- 
pound word to be described in a discourse dic- 
tionary, for comparing a candidate translation 
obtained from a single word dictionary for the 
elemental word with a candidate translation for 
the compound word, and for selecting the can- 
didate word that has the most nearly identical 
character string portion. 



4. The translation system according to claim 3, further
comprising:

means for canceling registration of a compound
word in a discourse dictionary when the ratio of
the identical character string portion in the
candidate word does not exceed a threshold value.

5. The translation system according to claim 2, wherein
the means for employing selects and translates
a candidate word to which the highest preference
is given among candidate words registered in the
discourse dictionary.

6. The translation system according to claim 5, wherein
said preference is calculated by a ratio of an identical
portion of a candidate word character string to
a compound word translation character string, and
by a coefficient obtained from the length of a
compound word.

7. The translation system according to claim 6, wherein
when the same candidate word has been registered
with the same headword in a discourse dictionary,
a new preference value is added to a preference
value that has already been provided.

8. The translation system according to claim 2, wherein
the co-occurrence dictionary is a co-occurrence
dictionary with which, when as candidate words
there are n specific words, a minimum of one
translation can be designated.

9. The translation system according to claim 8, wherein
priority is designated for the discourse dictionary
and the co-occurrence dictionary, and the dictionaries
are employed for translation in accordance
with higher priority.

10. The translation system according to claim 2, further
comprising:

(c) means for, to translate a specific document,
recording candidate words into which individual
words in a document were translated; and

(d) means for re-translating the document by
using the plurality of dictionaries including a
discourse dictionary that is generated when the
translation is completed.

11. The translation system according to claim 2, further
comprising:

(e) means for preparing a discourse dictionary
consisting of units of translated sentences, and
for merging the discourse dictionary to create
an automatic learning personal dictionary.



12. A translation method for performing translation us- 
ing a plurality of dictionaries, comprising the steps 
of: 

(a) during the translation of a document by us- 
ing a compound word dictionary, registering in 
a discourse dictionary, elemental word informa- 
tion of an applied compound word; and 

(b) employing a plurality of dictionaries, including
the discourse dictionary, in order to trans-
late a word in the document that is not defined 
in the compound word dictionary. 

13. A storing medium for storing a program for performing
translation using a plurality of dictionaries, the
program comprising:

(a) a function for, during the translation of a document
by using a compound word dictionary,
registering in a discourse dictionary, elemental
word information of an applied compound word;
and

(b) a function for employing a plurality of dictionaries,
including the discourse dictionary, in
order to translate a word in the document that
is not defined in the compound word dictionary.